Abstract

International trade has been increasingly organized in the form of global value chains
(GVCs), where different stages of production are located in different countries. This
recent phenomenon has substantial consequences for both trade policy design at the
national or regional level and business decision making at the firm level. In this paper,
we provide a new method for comparing GVCs across countries and over time. First,
we use the World Input-Output Database (WIOD) to construct both the upstream and
downstream global value networks, where the nodes are individual sectors in different
countries and the links are the value-added contribution relationships. Second, we
introduce a network-based measure of node similarity to compare the GVCs between any
pair of countries for each sector and each year available in the WIOD. Our network-based
similarity is a better measure for node comparison than the existing ones because
it takes into account all the direct and indirect relationships between country-sector
pairs, is applicable to both directed and weighted networks with self-loops, and takes
into account externally defined node attributes. As a result, our measure of similarity
reveals the most intensive interactions among the GVCs across countries and over time.
From 1995 to 2011, the average similarities between sectors and between countries show
clear increasing trends, which are temporarily interrupted by the recent economic crisis.
This measure of the similarity of GVCs provides quantitative answers to important
questions about dependency, sustainability, risk, and competition in the global production
system.

Keywords: Networks, Node Similarity, Input-Output Analysis, Global Value Chains, 
Vertical Specialization, International Trade 
1 Introduction


International trade has been increasingly characterized by the content of intermediate
inputs [1, 2] and by the formation of global value chains (GVCs) [3, 4] [5, 6] [7, 8] [9]. Thanks
to the development of transportation, information, and communications technologies,
different stages of production can be allocated and coordinated across borders. For instance,
merely 3% of the total value-added of China’s exports of iPhones and laptop computers in
2009 was sourced from China itself, while the remaining 97% came from other countries such as
the United States, Japan, and South Korea [10].

With global multi-regional input-output (GMRIO) tables becoming available [11], the
phenomenon of GVCs has been explored extensively in recent years by both theoretical
modeling [3 EH E] and empirical measurements BUS El IS El IS EM. Although previous
studies can tell us how ‘global’ the GVCs are by measuring the foreign value-added content
of exports for a given sector or country, this approach ignores the interdependence
and interconnectedness of the GVCs (as an exception, see [14], where the network structure
of the GVCs at the sector level is taken into account and simplified by the tree topology).
The notion of GVCs has been useful in capturing the fact that different stages of
production are organized across multiple countries, but global production sharing at the micro
level (e.g., for a certain product such as the iPhone) can be performed in a wide range of
configurations, including a chain (or “snake”), star (or “spider”), or any network topology
in between [12]. More importantly, at the aggregated sector level the GVCs are necessarily
embedded in a global production network, where significant value-added contributions
flow between sectors located in different countries. Any measure of the GVCs ignoring the
network structure would incur a great loss of information, and so the GVCs can only be
meaningfully compared if the network structure is accounted for.

Our paper is also related to the longstanding literature on export similarity. Since the
seminal work of Finger and Kreinin [15], multiple measures of similarity have been
introduced in the empirical study of international trade to calculate the overlap between the
distributions of exports or imports by commodity groups of two countries to the market
of third countries [16]. However, traditional measures of export similarity do not take into
account the fragmentation of global production, which accounts for two-thirds of
international trade [1].

To fill the gaps in the literature, we introduce a network-based measure of similarity
between the GVCs, which may provide possible insights into node clustering or community
detection [17][18][19], link prediction [20][21][22], and block modeling [23][24][25]. Decades of
literature have implemented measures of structural equivalence between nodes, with
equivalent nodes strongly connected to the same neighbors [20, 23]. More recent work has
focused on the concept of role equivalence, which relaxes the constraint that equivalent
nodes depend on identical neighbors and requires instead that they depend on other
equivalent nodes [20][26][27][28]. Role equivalence gives a more generalized sense of the
relationship between nodes by defining equivalence in a self-consistent fashion, but many
of these approaches are defined only for undirected or unweighted networks and do not
incorporate externally defined node attributes (e.g., country or sector information that is
available in the WIOD). In this paper, we develop a measure to identify the most intensive
interactions among the GVCs across countries and over time, incorporating the full network
topology.

To the best of our knowledge, our paper is the first attempt to measure and compare 
the GVCs at the sector level from a network-based approach. First, from a complex 
networks perspective, we map the World Input-Output Database (WIOD) [29] into both
the upstream and downstream global value networks (GVNs), where the nodes are the 
individual sectors in different countries and the links are the value-added contribution 
relationships. Second, we introduce a network-based measure of node similarity to compare 
the GVCs between any pair of countries for each sector and each year available in the 
WIOD. Unlike the previous methods, we take into account all the direct and indirect 
relationships to calculate the GVCs similarity, which provides a more accurate and systemic 
comparison between the GVCs in space and time. This measure of similarity may shed 
light on many important topics of the GVCs, such as dependency, sustainability, risk, and 
competition associated with the GVCs. 

The rest of the paper is structured as follows. Section 2 describes the WIOD and constructs 
both the upstream and downstream GVNs and introduces the network-based measure of 
GVCs similarity. Section 3 summarizes and discusses the results and Section 4 concludes 
the paper. 


2 Data and Methods 

A network can be broadly defined as a set of items (nodes) and the connections between
them (edges) [30, 31]. Recent years have witnessed a burgeoning body of research exploring
topics in economics and finance from a network perspective [32][33][34][35][36][37][38]. The
set of sectors and the input-output relationships between them can also be considered as an
interdependent network [38]. In this section we first map the WIOD into both the upstream
and downstream global value networks (GVNs), where the nodes are the individual sectors
in different countries and the links are the value-added contribution relationships. Notice
that the GVNs are both directed (i.e., links going from value-added provider sectors to
receiver sectors) and weighted (i.e., the share of value-added contribution varies from one
link to another). As a result, the upstream (or downstream) value system of a sector can
be obtained by searching for all the direct and indirect incoming (or outgoing) neighbors
of the given sector in the upstream (or downstream) GVN. We then propose a measure of
GVC similarity that is applicable to both directed and weighted GVNs with externally
defined node attributes (the country and sector of the node) so that we can quantify how
similar the GVCs are between any pair of countries for each sector and each year available
in the WIOD.

2.1 Data 

We use the recently available GMRIO database, the WIOD, to investigate the GVCs at 
the sector level mi- At the time of writing, the WIOD input-output tables cover 35 
sectors for each of the 40 economies (27 EU countries and 13 major economies in other 
regions) plus the rest of the world (RoW) and the years from 1995 to 2011. The 40 
economies are representative of the world economy in the sense that they produced around
84.1% of the world GDP in 2011. Table [A1] and Table [A2] list the countries and sectors
covered in the WIOD. For each year, there is a harmonized international input-output
table listing the input-output relationships between any pair of sectors in any pair of
economies. The numbers in the WIOD are in current basic (producers’) prices and are
expressed in millions of US dollars. In a GMRIO table, the matrix of input-output flows
between sectors is called the transactions matrix and is often denoted by Z. The rows of Z are
the distributions of the sector outputs throughout all the economies, while the columns
of Z are the distributions of inputs required by each sector. Note that sectors often buy
inputs from themselves, due to the sector aggregation. Besides intermediate sector use, the 
remaining outputs are absorbed by the additional columns of final demand, which includes 
household consumption, government expenditure, etc. Similarly, production necessitates 
not only inter-sectoral transactions but also labor, management, depreciation of capital, 
and taxes, which are denoted by the value-added vector v. The final demand matrix is 
often denoted by F and the total sector outputs are denoted by the vector x. 

2.2 Construct the Global Value Networks 

Defining $\mathbf{1}$ as a vector of 1’s of conformable size (i.e., with the vector length appropriate for
the multiplying matrix), and $F\mathbf{1} = f$, we can write the total global production as the
production used for the internal dependencies and the final demand, $x = Z\mathbf{1} + f$. Dividing
each column of Z by its corresponding total output in x produces the so-called technical
coefficients matrix A, with the terminology signifying that they represent the technologies
employed by the sectors to transform inputs into outputs. Replacing $Z\mathbf{1}$ with $Ax$, we
rewrite the output as $x = Ax + f$ and find that $x = (I - A)^{-1} f$. The matrix $(I - A)^{-1}$ is
often denoted by L and is called the Leontief inverse [39, 40].
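
The algebra above can be illustrated with a minimal NumPy sketch. The 3-sector transactions matrix and final demand vector below are invented for illustration only and are not WIOD data; the variable names are likewise our own.

```python
import numpy as np

# Toy 3-sector transactions matrix Z (rows sell to columns) and final demand f.
# All numbers are invented for illustration; they are not taken from the WIOD.
Z = np.array([[10.0, 20.0,  5.0],
              [15.0,  5.0, 10.0],
              [ 5.0, 10.0, 20.0]])
f = np.array([65.0, 70.0, 65.0])

x = Z.sum(axis=1) + f               # total output: x = Z 1 + f
A = Z / x[np.newaxis, :]            # technical coefficients: column j of Z divided by x_j
L = np.linalg.inv(np.eye(3) - A)    # Leontief inverse (I - A)^{-1}

x_from_leontief = L @ f             # should reproduce x = (I - A)^{-1} f
```

Here `x_from_leontief` recovers the total output vector, which is a convenient consistency check before any further matrices are derived from L.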

Dividing each element of v by its corresponding total output in x, we define the value-added
share vector w. Defining the operation of a ‘hat’ over a vector to result in a
diagonal matrix with the vector on its diagonal, the value-added contribution matrix can
be computed as

$$G = \hat{w} L \hat{f} \tag{1}$$

where G is the value-added contribution matrix and its element $G_{ij}$ is sector i’s value-added
contribution to sector j’s total final demand, $f_j$. The upstream value-added share
matrix, U, is defined as the column-normalized version of G,

$$U = G \left(\widehat{G^{T}\mathbf{1}}\right)^{-1} \tag{2}$$

where the element $U_{ij}$ is sector i’s share of value-added contribution out of sector j’s total
final demand, $f_j$. The downstream value-added share matrix, D, is similarly defined as
the row-normalized version of G:

$$D = \left(\widehat{G\mathbf{1}}\right)^{-1} G \tag{3}$$

where the element $D_{ij}$ is sector j’s share out of sector i’s total value-added contribution.
Note that the sum of each column of U is 1 while the sum of each row of D is 1. U
identifies the shares of the value-added providers for any given sector while D identifies
the shares of the value-added receivers for any given sector. Finally, the upstream GVNs
are constructed by using U as the weight matrix while the downstream GVNs can be
constructed with D as the weight matrix. Notice that the GVNs are directed, weighted,
and contain self-loops.
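
As a hedged sketch of Eqs. (1)-(3), the function below builds G, U and D from a transactions matrix, a final demand vector and a value-added vector. The function name and the toy inputs are our own assumptions; the value-added vector is chosen so that output equals intermediate inputs plus value added, and every sector is assumed to have positive final demand and value added.

```python
import numpy as np

def value_added_networks(Z, f, v):
    """Return (G, U, D) for transactions Z, final demand f, value added v.

    Assumes every sector has positive output, final demand and value added.
    """
    x = Z.sum(axis=1) + f                        # total output
    A = Z / x[np.newaxis, :]                     # technical coefficients
    L = np.linalg.inv(np.eye(len(x)) - A)        # Leontief inverse
    w = v / x                                    # value-added shares
    G = np.diag(w) @ L @ np.diag(f)              # Eq. (1): G = w^ L f^
    U = G / G.sum(axis=0, keepdims=True)         # Eq. (2): columns sum to 1
    D = G / G.sum(axis=1, keepdims=True)         # Eq. (3): rows sum to 1
    return G, U, D

# Toy data (invented): value added = output minus intermediate inputs.
Z = np.array([[10.0, 20.0,  5.0],
              [15.0,  5.0, 10.0],
              [ 5.0, 10.0, 20.0]])
f = np.array([65.0, 70.0, 65.0])
v = np.array([70.0, 65.0, 65.0])

G, U, D = value_added_networks(Z, f, v)
```

In this toy case the columns of G add up to the final demand vector and the rows add up to the value-added vector, so U and D are simply G rescaled by those totals.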

2.3 A Network-Based Measure of Node Similarity 

A wide range of similarity measures between nodes in a complex network have been
developed recently [20] that could potentially be used to determine similar nodes in the GVNs.
The simplest of these that are applicable to weighted networks include those defined by
a comparison of the overlap of direct providers, with prominent examples including the
weighted Jaccard coefficient [41] or cosine similarity [20] between a pair of nodes P and
Q (with each node representing a country-sector pair). These measures are respectively
defined as

$$J_{PQ} = \frac{\sum_{cs} \min(p_{cs}, q_{cs})}{\sum_{cs} \max(p_{cs}, q_{cs})} \tag{4}$$

and

$$C_{PQ} = \frac{\sum_{cs} p_{cs} q_{cs}}{\left[\left(\sum_{cs} p_{cs}^{2}\right)\left(\sum_{cs} q_{cs}^{2}\right)\right]^{1/2}} \tag{5}$$

where $p_{cs}$ is the dependence of P on the country-sector cs (and a similar definition for $q_{cs}$)
and the summation runs over all countries c and all sectors s that either P or Q depends
on (i.e., all c and s for which $p_{cs} > 0$ or $q_{cs} > 0$). In this paper, we will focus on a different
but related similarity measure defined by

$$S^{(0)}_{PQ} = \frac{\sum_{cs} \left[p_{cs}^{2} + q_{cs}^{2} - (p_{cs} - q_{cs})^{2}\right]}{\sum_{cs} \left[p_{cs}^{2} + q_{cs}^{2} + (p_{cs} - q_{cs})^{2}\right]} \tag{6}$$

$J_{PQ}$, $C_{PQ}$ and $S^{(0)}_{PQ}$ share a number of desirable properties: they are all
bounded between 0 and 1, with the value 0 attained iff P and Q have no providers in
common and the value 1 attained iff P and Q receive from the same nodes by an identical
amount. We further show in the Appendix that this definition is strongly related to the
definition of the weighted Jaccard coefficient and differs from the cosine similarity only by a
different normalization. The general characteristics of these local measures of similarity are
schematically diagrammed in Fig. [1] (A) for a hypothetical dependency network of German
Construction (node P) and Italian Construction (node Q). For all three, only identical
dependencies contribute to the measure of similarity between P and
Q. In this hypothetical example, $J_{PQ} = 0.25$, $C_{PQ} \approx 0.647$, and $S^{(0)}_{PQ} = 4/9$.
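
The three local measures can be compared on a small example. The sketch below is only illustrative: the dependency vectors `p` and `q` are invented (they are not the network of Fig. 1), and the function names are our own.

```python
import numpy as np

def jaccard(p, q):
    """Weighted Jaccard coefficient, Eq. (4)."""
    return np.minimum(p, q).sum() / np.maximum(p, q).sum()

def cosine(p, q):
    """Cosine similarity, Eq. (5)."""
    return (p @ q) / np.sqrt((p @ p) * (q @ q))

def s0(p, q):
    """Local similarity, Eq. (6)."""
    num = np.sum(p**2 + q**2 - (p - q)**2)
    den = np.sum(p**2 + q**2 + (p - q)**2)
    return num / den

# Invented dependency vectors over three providers; only the first is shared.
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.5, 0.0, 0.5])

j_pq, c_pq, s_pq = jaccard(p, q), cosine(p, q), s0(p, q)
```

For these vectors the Jaccard coefficient and the local similarity both equal 1/3 while the cosine similarity equals 1/2, and all three return 1 when a vector is compared with itself.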

While purely local measures of similarity have been implemented in a wide range of studies,
they are too limited to fully understand the relationship between national production
systems because upstream providers that are ‘similar’ but not identical contribute nothing
to the measure of similarity between P and Q. More meaningful information about the
similarity between two production systems can be extracted by defining a measure of
role equivalence [20][26][27], which implements a more self-consistent measure of similarity.
Existing methods of measuring role equivalence may not be appropriate for the study of the
GVCs, because the attributes of each node in the network cannot necessarily be treated on
an equal footing. One might expect that a country-sector pair could change the nationality
of its provider (for example, German construction exchanging its direct input from French
construction for one from the construction sector in another nation), but not change the sector of
the input (German construction could not replace its French construction input with one from another
industrial sector, regardless of the nation of origin). The differing economic meanings
behind the node attributes suggest that we develop a measure of similarity that explicitly
takes these attributes into account (as in $S_{PQ}$).


Figure 1: Schematic diagrams of the methods of measuring the similarity between two
nodes in the GVNs, with hypothetical dependencies of the German and Italian Construction
sectors (P = DE-c and Q = IT-c respectively) shown. Construction sectors are shown
as squares and manufacturing sectors as triangles, while countries are represented by color
(France is blue, Germany cyan, Spain brown, Britain red, and Italy gray). Dependency
links that provide a significant contribution to the similarity between DE-c and IT-c are
highlighted in yellow. In (A), we diagram structural similarity using purely local
dependency information (as in $J_{PQ}$, $C_{PQ}$, and $S^{(0)}_{PQ}$), with the similarity between DE-c and IT-c
due solely to the overlap between the identical provider of British Construction (GB-c).
In (B), we compare only the sectoral dependencies of the nodes (captured in
$S^{(1)}_{PQ}$), so all links contribute to the similarity once national differences are ignored. (C) shows
an interpolation between these two extremes, where all upstream construction links for
both DE-c and IT-c have the same provider (ES-c), making these providers similar, but
the manufacturing links for DE-c and IT-c have different providers.

The definition of $S^{(0)}_{PQ}$ in Eq. [6] represents a lower bound on any meaningful definition
of role equivalence between country-sector pairs, because it treats each distinct national
production system as completely different. We can define an upper bound for similarity
in a related manner by assuming that national systems of production are all completely
identical instead of being completely distinct. This approximation is schematically
diagrammed in Fig. [1] (B), where sectors of production are considered distinct (Construction
and Manufacturing are different fields) but national identities are treated as irrelevant. A
measure of similarity equivalent to that in Eq. [6] can be developed in this approximation,
with

$$S^{(1)}_{PQ} = \frac{\sum_{s} \left[\left(\sum_{c} p_{cs}\right)^{2} + \left(\sum_{c} q_{cs}\right)^{2} - \left(\sum_{c} (p_{cs} - q_{cs})\right)^{2}\right]}{\sum_{s} \left[\left(\sum_{c} p_{cs}\right)^{2} + \left(\sum_{c} q_{cs}\right)^{2} + \left(\sum_{c} (p_{cs} - q_{cs})\right)^{2}\right]} = \frac{\sum_{cs} \left[p_{cs}^{2} + q_{cs}^{2} - (p_{cs} - q_{cs})^{2}\right] + \sum_{s} T^{-}_{PQ}(s)}{\sum_{cs} \left[p_{cs}^{2} + q_{cs}^{2} + (p_{cs} - q_{cs})^{2}\right] + \sum_{s} T^{+}_{PQ}(s)} \tag{7}$$

where we have defined $T^{\pm}_{PQ}(s) = \sum_{c \neq c'} \left[p_{cs} p_{c's} + q_{cs} q_{c's} \pm (p_{cs} - q_{cs})(p_{c's} - q_{c's})\right]$. In Fig.
[1] (B) it is straightforward to see that $S^{(1)}_{PQ} = 1$, because the inputs on the sectoral level
are identical between German construction and Italian construction. We note that Eq. [7]
is identical to Eq. [6] in the absence of the terms $T^{\pm}_{PQ}(s)$ (a fact that is the primary reason
for our choice in using this measure of similarity).
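
The relationship between the two bounds can be sketched numerically. In the hedged example below, dependencies are stored as a country-by-sector array (a layout we assume for illustration), and the upper bound is obtained by summing over countries before applying the same formula as the lower bound. The numbers are invented.

```python
import numpy as np

def s0(p, q):
    """Lower bound, Eq. (6): every country-sector is treated as distinct."""
    num = np.sum(p**2 + q**2 - (p - q)**2)
    den = np.sum(p**2 + q**2 + (p - q)**2)
    return num / den

def s1(p, q):
    """Upper bound, Eq. (7): aggregate over countries (axis 0), keep sectors."""
    return s0(p.sum(axis=0), q.sum(axis=0))

# Rows are countries, columns are sectors (construction, manufacturing).
# P and Q use the same sector mix but source it from different countries.
p = np.array([[0.6, 0.0],
              [0.0, 0.4]])
q = np.array([[0.0, 0.4],
              [0.6, 0.0]])

lower, upper = s0(p, q), s1(p, q)
```

Because P and Q share no provider, the lower bound is 0, while the identical sector mix gives an upper bound of 1; any role-equivalence measure for this pair would fall between these two values.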


The difference between perfect national similarity (Eq. [7]) and perfect national dissimilarity
(Eq. [6]) is entirely contained within the sector-dependent terms $T^{\pm}_{PQ}(s)$, and we note that
$T^{\pm}_{PQ}(s)$ is a sum over terms involving the direct relationships of P and Q to the
countries c and c' in sector s. In the context of a role equivalence calculation, these terms
should not all be treated equally: country-sector pairs that are role-equivalent should
contribute significantly to the similarity of P and Q, while country-sector pairs that are
not role-equivalent should not contribute (diagrammed schematically in Fig. [1] (C)). This
can be accomplished by weighting each term in the sum by the similarity between countries
c and c' in sector s, and we thus write the self-consistent relation

$$S_{PQ} = \frac{\sum_{s} \sum_{c,c'} \left\{\left[p_{cs} p_{c's} + q_{cs} q_{c's} - (p_{cs} - q_{cs})(p_{c's} - q_{c's})\right] \times S_{cs,c's}\right\}}{\sum_{s} \sum_{c,c'} \left\{\left[p_{cs} p_{c's} + q_{cs} q_{c's} + (p_{cs} - q_{cs})(p_{c's} - q_{c's})\right] \times S_{cs,c's}\right\}} \tag{8}$$

as our final expression for the similarity between two country-sectors P and Q. It is
straightforward to verify that the diagonal elements identically satisfy $S_{PP} = 1$ for all
country-sector pairs P, and that $S^{(0)}_{PQ} \leq S_{PQ} \leq S^{(1)}_{PQ}$ for all P and Q. If all countries
are treated as different (with $S_{PQ} = 0$ for $P \neq Q$) Eq. [8] reduces to Eq. [6], whereas
Eq. [8] reduces to Eq. [7] if all countries are assumed identical (with $S_{PQ} = 1$ for all
countries). In the Appendix, we discuss some additional numerical properties of Eq. [8] and
the algorithm we use to determine the numerical values of the similarity. Eq. [8] incorporates
a comparison between each of the direct providers of P and Q, but by weighting each term
by the similarity it implicitly includes a comparison between the indirect suppliers of P and
Q (those that are providers of the providers). Two different direct providers of P and
Q that themselves have similar inputs will make a large contribution to the similarity
$S_{PQ}$, while direct providers that themselves have very different value chains will make a
small contribution. This can be clearly seen by computing the similarity in Fig. [1] (C),
where we numerically find $S_{PQ} \approx 0.889$ (in comparison to $S^{(0)}_{PQ} \approx 0.444$ and $S^{(1)}_{PQ} =
1$). This shows that Eq. [8] captures our expectation that the similarities in the direct
construction inputs due to the shared indirect link (Spanish construction) increase the
similarity between German and Italian construction, but the dissimilarities in the direct
manufacturing suppliers prevent a perfect role-similarity between them.
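
One way to explore Eq. (8) numerically is a naive fixed-point iteration. The sketch below is an assumption on our part and is not necessarily the algorithm described in the Appendix. It uses the algebraic simplification that, for a symmetric weight matrix M (the current similarity restricted to same-sector pairs), the ratio in Eq. (8) equals $p^{T}Mq / (p^{T}Mp + q^{T}Mq - p^{T}Mq)$. Node ordering (country-major), the function names and the random toy network are all hypothetical.

```python
import numpy as np

def tanimoto(K):
    """Elementwise K_PQ / (K_PP + K_QQ - K_PQ)."""
    d = np.diag(K)
    return K / (d[:, None] + d[None, :] - K)

def network_similarity(U, n_sectors, n_iter=200, tol=1e-10):
    """Iterate Eq. (8) on an upstream share matrix U.

    U[i, P] is the share of node i in the value added of node P, and nodes
    are assumed to be ordered so that index = country * n_sectors + sector.
    """
    n = U.shape[0]
    sector = np.arange(n) % n_sectors
    same_sector = (sector[:, None] == sector[None, :]).astype(float)

    S0 = tanimoto(U.T @ U)                  # Eq. (6): only identical providers count
    S1 = tanimoto(U.T @ same_sector @ U)    # Eq. (7): all same-sector providers count
    S = S0.copy()
    for _ in range(n_iter):
        M = S * same_sector                 # weights S_{cs,c's}, zero across sectors
        S_new = tanimoto(U.T @ M @ U)
        converged = np.max(np.abs(S_new - S)) < tol
        S = S_new
        if converged:
            break
    return S0, S, S1

# Toy network: 3 countries x 2 sectors with random column-normalized weights.
rng = np.random.default_rng(0)
U = rng.random((6, 6))
U /= U.sum(axis=0, keepdims=True)

S0, S, S1 = network_similarity(U, n_sectors=2)
```

The iteration starts from the lower bound and repeatedly re-weights provider pairs by their current similarity; whether and how fast it converges on real WIOD-sized networks is not something this sketch establishes.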

Because the magnitude of $S_{PQ}$ by itself cannot distinguish between similarity due to P and Q
sharing identical providers and similarity due to sharing role-equivalent providers, we further define the
rescaled similarity

$$R_{PQ} = \frac{S_{PQ} - S^{(0)}_{PQ}}{S^{(1)}_{PQ} - S^{(0)}_{PQ}} \tag{9}$$

which indicates how close $S_{PQ}$ is to its upper bound $S^{(1)}_{PQ}$. Because the upper bound
$S^{(1)}_{PQ}$ completely ignores national differences, a value of $R_{PQ}$ very close to 1 means that
there is a significant national similarity between the sectors compared. In other words, the
rescaled version allows us to attribute its magnitude to the similarity between different
nations.
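
As a worked illustration, the values quoted above for the hypothetical network of Fig. [1] (C) can be inserted into the rescaled similarity, assuming it is read as the position of $S_{PQ}$ between its lower and upper bounds:

```latex
R_{PQ} = \frac{S_{PQ} - S^{(0)}_{PQ}}{S^{(1)}_{PQ} - S^{(0)}_{PQ}}
       \approx \frac{0.889 - 0.444}{1 - 0.444}
       = \frac{0.445}{0.556}
       \approx 0.80
```

Under this reading, roughly 80% of the gap between treating all nations as distinct and treating them as identical is closed in that example.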


In this section we have only discussed the similarity based on the upstream GVNs, whose 
adjacency matrix U is both asymmetrical (directed) and real-valued between 0 and 1 
(weighted) and with non-zero diagonal elements (self-loops). Measuring a downstream 
similarity using the methods in this section can be equivalently accomplished by applying 
the same methodologies to the transposed downstream networks (reversing the direction 
of the links, so that receiver sectors become provider sectors). 


3 Results 

3.1 General Patterns of Similarity 

We compute the pairwise similarity across countries for each industry and each year
available in the WIOD. It is worthwhile to examine how strongly correlated our measure of
similarity is with the alternative measures. They tend to be highly correlated, with the
rescaled version of our measure of similarity more highly correlated with $C_{PQ}$ (0.83
upstream, 0.78 downstream) than with $J_{PQ}$ (0.66 upstream, 0.72 downstream). Even though
the correlation is high, it should be noted that, unlike other local measures of similarity,
our measure of similarity takes into account both direct and indirect relationships along
the value chain. In Fig. [2] we see that when we include indirect value-added providers in
the computation of the upstream similarity, country-sector pairs become more similar to
one another. Our network-based measure of similarity is much less correlated with cosine
similarity than the local version (the lower bound $S^{(0)}_{PQ}$) of our index, which differs from
cosine only in the normalization term.



Figure 2: The scatter plot of cosine similarity (x-axis) vs. our measures of similarity for all
pairwise comparisons of sectors across countries for the upstream GVNs and for all years.
Both $S^{(0)}_{PQ}$ and $S_{PQ}$ are reported, in red and blue respectively.

We explore the evolution of the similarity between sectors by computing the mean
similarity for all sectors and country pairs, $\sum_{s} \sum_{c \neq c'} S_{cs,c's} / [N_s N_c (N_c - 1)]$, with $N_c = 41$ the
number of countries and $N_s = 35$ the number of sectors. Fig. [3] reveals that, on average,
sectors across the globe tend to become more similar over time, a fact that is consistently
observed using all measures of similarity. All measures also show that upstream similarity is
more volatile and less intense than the downstream similarity. However, when all network
interdependences are taken into account, the upstream and downstream similarities tend
to more closely follow the same path of growth and both exhibit a temporary reduction in
the aftermath of the Great Recession of 2008 (the latter is also captured by Jaccard).






Figure 3: The evolution of average similarity across countries and sectors over time, 1995-
2011. We compare four different measures of similarity: Jaccard [$J_{PQ}$], Cosine [$C_{PQ}$],
Similarity(0) [$S^{(0)}_{PQ}$], and Network Similarity [$S_{PQ}$]. For every indicator, we report both
upstream (solid lines) and downstream (broken lines) similarity.


For each year, we can average across countries to obtain the average similarity for each
industry, $\sum_{c \neq c'} S_{cs,c's} / [N_c (N_c - 1)]$. Fig. [4] shows both the average upstream and downstream
similarities for all the sectors and for the years 1995 and 2011. It is straightforward to see
that most sectors have increased their similarities over time, as most “arrows” point
in the northeast direction. Sectors like “Coke, Refined Petroleum and Nuclear Fuel (Cok)”
have high average upstream similarity and relatively low average downstream similarity,
which means that it is more likely to find country-sector overlap in their upstream value
chains. This makes sense for the sector “Cok”: energy providers tend to be concentrated
in only a few countries. More generally, the manufacturing sectors tend to be more similar
across countries than the services sectors, as the former are clustered in the top right of Fig.
[4] and the latter are clustered in the lower left of Fig. [4].


ID   Full Name                                    ID   Full Name
---  -------------------------------------------  ---  -----------------------------------------------
Agr  Agriculture, Hunting, Forestry and Fishing   Sal  Motor Vehicles and Motorcycles; Fuel
Min  Mining and Quarrying                         Whl  Wholesale Trade and Commission Trade
Fod  Food, Beverages and Tobacco                  Rtl  Retail Trade; Repair of Household Goods
Tex  Textiles and Textile Products                Htl  Hotels and Restaurants
Lth  Leather, Leather and Footwear                Ldt  Inland Transport
Wod  Wood and Products of Wood and Cork           Wtt  Water Transport
Pup  Pulp, Paper, Paper, Printing and Publishing  Ait  Air Transport
Cok  Coke, Refined Petroleum and Nuclear Fuel     Otr  Auxiliary Transport Activities; Travel Agencies
Chm  Chemicals and Chemical Products              Pst  Post and Telecommunications
Rub  Rubber and Plastics                          Fin  Financial Intermediation
Omn  Other Non-Metallic Mineral                   Est  Real Estate Activities
Met  Basic Metals and Fabricated Metal            Obs  Renting of M&Eq and Other Business Activities
Mch  Machinery, Nec                               Pub  Public Admin and Defence; Social Security
Elc  Electrical and Optical Equipment             Edu  Education
Tpt  Transport Equipment                          Hth  Health and Social Work
Mnf  Manufacturing, Nec; Recycling                Ocm  Other Community, Social and Personal Services
Ele  Electricity, Gas and Water Supply            Pvt  Private Households with Employed Persons
Cst  Construction




Figure 4: The average upstream and downstream similarities of sectors for the years 1995 
and 2011 using a logarithmic scale. 


For each year, we can also average across industries and foreign economies
($\sum_{s} \sum_{c' \neq c} S_{cs,c's} / [N_s (N_c - 1)]$) to define a mean similarity for each nation. Fig. [5] shows both the average upstream
and downstream similarities for all the countries and for the years 1995 and 2011. Again,
we observe a general increasing trend of the similarities (see the change of the axis range
over time). Furthermore, the “Asian miracle” economies, South Korea and Taiwan, are
clearly associated with high average similarities when compared with other countries. As
in the study of Ref. [42], we also find that China has been increasingly involved in
vertical specialization and has moved so dramatically over time that it has joined the other
“Asian miracle” nations in terms of the similarities. In the Appendix, we further report
the clustering results based on the average similarities of countries.
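
The three averages used in this section (over all sectors and country pairs, per sector, and per country) can be sketched as follows. The array layout `S[c, c', s]`, holding the similarity between countries c and c' in sector s, is our own assumption, as are the function name and the toy values.

```python
import numpy as np

def average_similarities(S):
    """Return (overall, by_sector, by_country) means of S[c, c', s].

    Pairs with c == c' are excluded, matching the sums over c != c'.
    """
    n_c, _, n_s = S.shape
    off_diag = ~np.eye(n_c, dtype=bool)              # True where c != c'
    masked = S * off_diag[:, :, None]
    overall = masked.sum() / (n_s * n_c * (n_c - 1))
    by_sector = masked.sum(axis=(0, 1)) / (n_c * (n_c - 1))
    by_country = masked.sum(axis=(1, 2)) / (n_s * (n_c - 1))
    return overall, by_sector, by_country

# Toy input: 3 countries, 2 sectors, all cross-country similarities equal to 0.5.
S = np.full((3, 3, 2), 0.5)
for c in range(3):
    S[c, c, :] = 1.0                                 # self-similarity, excluded below

overall, by_sector, by_country = average_similarities(S)
```

With a uniform cross-country similarity of 0.5, all three averages return 0.5, which confirms that the self-similarity entries on the diagonal do not inflate the means.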

Figure 5: The average upstream and downstream similarities of countries in 1995
and 2011, plotted on logarithmic axes.


3.2 Specific Case Studies 

A convenient way to organize our results is to show the country-by-country matrix of
pairwise similarities for specific sectors and years. Fig. [6] is an example for the upstream
rescaled similarity and the downstream rescaled similarity for the electrical engineering
sector, “Elc” (see [43, 44] for a recent analysis of the same sector). Notice that, by our
definition of similarity, the matrix is symmetrical and has all 1’s on its diagonal. There is
a visually clear increase in the similarity between most nations in “Elc” between 1995 and
2011, and many economies that were very dissimilar in 1995 became very similar in 2011
(with China being a prominent example). In 1995, China is neither upstream-similar nor
downstream-similar to any other country, as its corresponding rows and columns are barely
colored. In 2011, however, China becomes fairly upstream-similar to the Czech Republic,
Hungary, Mexico, Slovakia, Taiwan, etc., with the Czech Republic as its most upstream-similar
country. On the other hand, China becomes highly downstream-similar to South Korea
and Taiwan, with Taiwan as its most downstream-similar country.

To see the dynamics at a finer resolution, we show the significant first-degree neighbors
(i.e., those with link weight no less than 0.005) of the electrical equipment sector in China
and the Czech Republic in the upstream GVNs in 1995 and 2011 in Fig. [7] and in Fig. [8] the
significant first-degree neighbors of “Elc” in China and Taiwan in the downstream GVNs in
1995 and 2011. Note that while our measure of similarity takes into account all the indirect
neighbors, we only show the first-degree neighbors in Figs. [7] and [8] for better visualization. Over
time, the number of shared value-added providers between China and the Czech Republic
has increased, and a direct interaction between the two sectors becomes significant as a
new link is formed between them. Likewise, the number of shared value-added receivers
increases between China and Taiwan over time.
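
The selection of significant first-degree neighbors can be sketched as a simple threshold on the weight matrices. The helper names and the 4-node toy matrix below are hypothetical; only the 0.005 threshold comes from the text.

```python
import numpy as np

def significant_providers(U, node, threshold=0.005):
    """Incoming neighbors of `node` in the upstream network (column of U)."""
    return set(np.flatnonzero(U[:, node] >= threshold).tolist())

def significant_receivers(D, node, threshold=0.005):
    """Outgoing neighbors of `node` in the downstream network (row of D)."""
    return set(np.flatnonzero(D[node, :] >= threshold).tolist())

# Toy upstream weights (invented); each column sums to 1.
U = np.array([[0.700, 0.004, 0.0, 0.0],
              [0.200, 0.600, 0.0, 0.0],
              [0.097, 0.300, 1.0, 0.0],
              [0.003, 0.096, 0.0, 1.0]])

providers_0 = significant_providers(U, 0)
providers_1 = significant_providers(U, 1)
shared = providers_0 & providers_1
```

In this toy case nodes 0 and 1 each have three significant providers (a node can be its own provider through its self-loop) and share two of them; counting such shared sets in two different years is the comparison visualized in the neighbor figures.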

Eq. [8] can give valuable insights into which sectors are responsible for the increased
similarity. We can decompose the numerator of Eq. [8] into individual terms and examine exactly
how much each pair of country-sectors contributes to the similarity score. We divide the
country pairs into three categories: purely internal (both countries either China or the Czech
Republic for the upstream case), purely external (neither country China nor the Czech
Republic for the upstream case), and mixed (one either China or the Czech Republic and the other
a different country for the upstream case). Fig. [9] (A) shows the purely internal share of
the upstream similarity (dashed line) and the rescaled upstream similarity between the
electrical equipment sector in China and the one in the Czech Republic over time. The purely
internal share is well correlated with the upstream similarity in this case (both are
increasing in time), which implies that more intensive direct interaction between China and the
Czech Republic is the main driving force behind their increased similarity. This is indeed
supported by Fig. [7] (A), where the electrical equipment sectors in China and the Czech
Republic form a significant direct link between themselves in 2011. Fig. [9] (B) shows the
purely internal share of the downstream similarity and the rescaled downstream similarity
between the electrical equipment sector in China and the one in Taiwan over time.
Unlike the upstream case between China and the Czech Republic, the purely internal share is
not well correlated with the rescaled downstream similarity, suggesting that the overlap of
foreign sectors is likely responsible for their increased similarity instead of shared internal
connections.
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Figure 6: The pairwise upstream and downstream similarity across countries of the elec¬ 
trical equipment sector in 1995 and 2011. Darker color indicates higher values. In 1995, 
China is not very similar to any other countries. In 2011, the most similar countries to 
China are Czech Republic (upstream) and Taiwan (downstream). 
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Figure 7: The first-degree neighbors of the electrical equipment sector in China and Czech 
Republic in the upstream GVNs in 1995 and 2011. Any incoming links to the two sectors 
with weight greater than or equal to 0.005 are shown. Over time, the number of shared 
value-added providers increases for the two sectors, and the direct interaction between the 
two sectors becomes significant as a new link is formed between them in 2011. 
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Figure 8: The first-degree neighbors of the electrical equipment sector in China and Taiwan 
in the downstream GVNs in 1995 and 2011. Any outgoing links from the two sectors with 
weight greater than or equal to 0.005 are shown. Over time, the number of shared value- 
added receivers has increased for the two sectors. 
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Figure 9: (A) The purely internal share of the upstream similarity (dashed line) and the 
rescaled upstream similarity (solid line) between the electrical equipment sector (“Elc”) in 
China and the Czech Republic from 1995-2011. (B) The purely internal share of the down¬ 
stream similarity and the rescaled downstream similarity between the electrical equipment 
sector in China and Taiwan from 1995-2011. The purely internal share is well correlated 
with the rescaled upstream similarity between China and the Czech Republic, but they are 
not correlated in the downstream case between China and Taiwan. 
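
The three-way split described in the text (purely internal, mixed, purely external) can be
sketched in a few lines of Python. This is a minimal illustration, not the code used for
Fig. [9]: it assumes that each term of the numerator of Eq. [8] has the same form as the
numerator terms of the iteration in Section 5.3, and the function name, the dictionary
layout, and the toy weights below are hypothetical.

```python
def decompose_numerator(p, q, sim, focal):
    """Split the similarity numerator into internal / mixed / external shares.

    p, q  : dicts mapping (country, sector) -> link weight of the two focal nodes
    sim   : dict mapping (sector, c, c2) -> similarity of countries c and c2 in
            that sector; missing entries default to 1 if c == c2, else 0
    focal : set holding the two focal countries, e.g. {"CHN", "CZE"}
    """
    totals = {"internal": 0.0, "mixed": 0.0, "external": 0.0}
    keys = set(p) | set(q)
    countries = {c for (c, _) in keys}
    sectors = {s for (_, s) in keys}
    for s in sectors:
        for c in countries:
            for c2 in countries:
                pc, qc = p.get((c, s), 0.0), q.get((c, s), 0.0)
                pc2, qc2 = p.get((c2, s), 0.0), q.get((c2, s), 0.0)
                # Assumed form of one numerator term (same as in Section 5.3).
                weight = pc * pc2 + qc * qc2 - (pc - qc) * (pc2 - qc2)
                term = weight * sim.get((s, c, c2), 1.0 if c == c2 else 0.0)
                # 0 focal countries -> external, 1 -> mixed, 2 -> internal.
                n_focal = (c in focal) + (c2 in focal)
                totals[("external", "mixed", "internal")[n_focal]] += term
    grand = sum(totals.values())
    return {k: (v / grand if grand else 0.0) for k, v in totals.items()}


# Toy example with made-up weights (not WIOD data).
p = {("CHN", "Elc"): 0.2, ("DEU", "Elc"): 0.1}   # providers of the first node
q = {("CZE", "Elc"): 0.3, ("DEU", "Elc"): 0.1}   # providers of the second node
sim = {("Elc", "CHN", "CZE"): 0.5, ("Elc", "CZE", "CHN"): 0.5}
shares = decompose_numerator(p, q, sim, {"CHN", "CZE"})
print(shares)
```

Tracking the internal share year by year alongside the similarity itself is what allows the
comparison drawn in Fig. [9].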


4 Concluding remarks 

In recent decades, international trade has been marked by the spatial fragmentation of
production, which is captured by the notion of global value chains (GVCs). A good
understanding of the evolution of the GVCs is of vital importance for macro decision makers
to design proper and timely policies and for micro decision makers to engage in and
benefit from the revolution [35]. A method of measuring and comparing the GVCs in a
systematic way is necessary for informed decisions on both scales, yet the existing
literature remains silent on this question. This paper has aimed to fill this gap.
First, we use the World Input-Output Database (WIOD) to construct both the upstream
and downstream global value networks, where the nodes are the individual sectors in
different countries and the links are the value-added contribution relationships. Second, to
systematically compare the GVCs, we define a network-based measure of role equivalence
that takes the differing types of attributes of each node into account. Our measure of
similarity assumes that while it is possible to exchange the nationality of a direct provider
in a particular sector, the sectors themselves are not interchangeable. Coupling this
expectation with naturally defined lower and upper bounds on similarity permitted the
self-consistent definition of similarity.


We have found that manufacturing sectors tend to be more similar across countries than
services sectors, while countries like China have increased their average similarity over
time. As a case study, we found that the electrical equipment sector in China has
become upstream-similar to the one in the Czech Republic and downstream-similar to the one
in Taiwan. Our measure of similarity enables us to identify the most intensive interactions
among the GVCs across countries and over time. However, the driving forces behind
the interactions can be either internal or external, which can be interpreted as
value-chain integration or value-chain competition, respectively. Identifying and quantifying
these differences will be left for future work.

Regarding the potential uses and policy implications of our measure of GVC similarity, we
expect that GVC similarity will be a better measure than export/import similarity
(measured without reference to the topology of the global network). The latter has been
widely used in the trade literature as a proxy for competition and trade diversion between
countries. However, gross trade statistics can be seriously flawed (by double counting)
now that global production sharing has become the norm. In addition, the trade diversification
measured by export/import similarity has become a less reliable indicator of a country’s
competitiveness because similar GVCs are compatible with very dissimilar export outputs
(as was the case for China). Our measure may also be useful as a predictor of future link
formation, drawing on the link prediction literature in the field of complex networks,
where high similarity between the country-sector pairs identified by our measure may
suggest increasingly intense value-added relationships in the future. Finally, since the
GVCs tend to become more similar over time and countries tend to become more vertically
specialized, there are concerns about the systemic risk of the global production system.
Integration and diversification are two important features for the stability of input-output
systems [36]. Our results suggest that effective diversification is lower than expected due
to the increasing overlap of trading partners along value chains, which in turn increases
the risk of instability.

Some possible future extensions to this paper include quantifying the driving forces behind
the dynamics of similarity, as mentioned in the previous section. Our approach can also be
generalized to networks with more than two types of node attributes: so long as it
is possible to meaningfully define the lower and upper bounds on the similarity given the
constraints of the differing attributes, it may be of interest to define a similar measure of
self-consistent similarity. This approach can also be modified to incorporate other
economically relevant information. For example, the greater reliance that a sector typically
has on itself and on the domestic economy at large (in comparison to foreign sectors) may
suggest that differentiating between domestic and foreign sectors and treating self-loops
differently may be appropriate. In these cases, adapting the upper and lower bounds found in
Eqs. [7] and [6] to meaningfully capture the differences between foreign and domestic or
between self- and non-self-dependence should naturally give rise to an equivalent
self-consistent measure of similarity.


5 Appendix 

5.1 WIOD Coverage 


Table A1: List of WIOD economies.

Euro-Zone              Non-Euro EU            NAFTA               East Asia              BRIIAT
Economy      3L Code   Economy      3L Code   Economy   3L Code   Economy      3L Code   Economy    3L Code
Austria      AUT       Bulgaria     BGR       Canada    CAN       China        CHN       Australia  AUS
Belgium      BEL       Czech Rep.   CZE       Mexico    MEX       Japan        JPN       Brazil     BRA
Cyprus       CYP       Denmark      DNK       USA       USA       South Korea  KOR       India      IND
Estonia      EST       Hungary      HUN                           Taiwan       TWN       Indonesia  IDN
Finland      FIN       Latvia       LVA                                                  Russia     RUS
France       FRA       Lithuania    LTU                                                  Turkey     TUR
Germany      DEU       Poland       POL
Greece       GRC       Romania      ROM
Ireland      IRL       Sweden       SWE
Italy        ITA       UK           GBR
Luxembourg   LUX
Malta        MLT
Netherlands  NLD
Portugal     PRT
Slovakia     SVK
Slovenia     SVN
Spain        ESP


Table A2: List of WIOD sectors.

WIOD Code  ISIC Rev. 3 Code  3-Letter Code  Full Name
c1         AtB               Agr            Agriculture, Hunting, Forestry and Fishing
c2         C                 Min            Mining and Quarrying
c3         15t16             Fod            Food, Beverages and Tobacco
c4         17t18             Tex            Textiles and Textile Products
c5         19                Lth            Leather, Leather and Footwear
c6         20                Wod            Wood and Products of Wood and Cork
c7         21t22             Pup            Pulp, Paper, Paper, Printing and Publishing
c8         23                Cok            Coke, Refined Petroleum and Nuclear Fuel
c9         24                Chm            Chemicals and Chemical Products
c10        25                Rub            Rubber and Plastics
c11        26                Omn            Other Non-Metallic Mineral
c12        27t28             Met            Basic Metals and Fabricated Metal
c13        29                Mch            Machinery, Nec
c14        30t33             Elc            Electrical and Optical Equipment
c15        34t35             Tpt            Transport Equipment
c16        36t37             Mnf            Manufacturing, Nec; Recycling
c17        E                 Ele            Electricity, Gas and Water Supply
c18        F                 Cst            Construction
c19        50                Sal            Sale, Maintenance and Repair of Motor Vehicles and Motorcycles; Retail Sale of Fuel
c20        51                Whl            Wholesale Trade and Commission Trade, Except of Motor Vehicles and Motorcycles
c21        52                Rtl            Retail Trade, Except of Motor Vehicles and Motorcycles; Repair of Household Goods
c22        H                 Htl            Hotels and Restaurants
c23        60                Ldt            Inland Transport
c24        61                Wtt            Water Transport
c25        62                Ait            Air Transport
c26        63                Otr            Other Supporting and Auxiliary Transport Activities; Activities of Travel Agencies
c27        64                Pst            Post and Telecommunications
c28        J                 Fin            Financial Intermediation
c29        70                Est            Real Estate Activities
c30        71t74             Obs            Renting of M&Eq and Other Business Activities
c31        L                 Pub            Public Admin and Defence; Compulsory Social Security
c32        M                 Edu            Education
c33        N                 Hth            Health and Social Work
c34        O                 Ocm            Other Community, Social and Personal Services
c35        P                 Pvt            Private Households with Employed Persons

5.2 Relationship with Jaccard and Cosine Similarities 

There are many possible ways of measuring the similarity between nodes in a weighted
network using information involving only their nearest neighbors, with the Jaccard [5T] and
Cosine [20] similarities being often used. We have chosen to use Eq. [6], and in this section
we show its relationship to both the Jaccard and Cosine similarities. It is a mathematical
identity that

$$J_{PQ} = \frac{\sum_{cs} \min(p_{cs}, q_{cs})}{\sum_{cs} \max(p_{cs}, q_{cs})}
         = \frac{\sum_{cs} \left[ p_{cs} + q_{cs} - |p_{cs} - q_{cs}| \right]}
                {\sum_{cs} \left[ p_{cs} + q_{cs} + |p_{cs} - q_{cs}| \right]} \qquad (10)$$

with the numerator and denominator differing only in a change of sign on the terms
involving the absolute value of $p_{cs} - q_{cs}$. $J_{PQ}$ satisfies the useful property that
$0 \le J_{PQ} \le 1$, with the equalities occurring iff P and Q have either no weight to
identical nodes or all identical weights. Many other functional forms satisfy this
requirement, though, with a family of examples being

$$S^{(\alpha)}_{PQ} = \frac{\sum_{cs} \left[ p_{cs}^{\alpha} + q_{cs}^{\alpha} - |p_{cs} - q_{cs}|^{\alpha} \right]}
                          {\sum_{cs} \left[ p_{cs}^{\alpha} + q_{cs}^{\alpha} + |p_{cs} - q_{cs}|^{\alpha} \right]} \qquad (11)$$

for all $\alpha > 0$, with Eq. [6] coinciding with the choice of $\alpha = 2$. Due to the convenient
link between Eqs. [6] and [7] that could exist only with the choice of $\alpha = 2$, there is utility
in selecting this specific value of $\alpha$. We further note that for $\alpha = 2$, the numerator of
Eq. [11] is $\sum_{cs} \left[ p_{cs}^2 + q_{cs}^2 - (p_{cs} - q_{cs})^2 \right] = 2 \sum_{cs} p_{cs} q_{cs}$,
exactly twice the numerator in the definition of Cosine similarity. While $S_{PQ}$ and $C_{PQ}$
have differing normalizations, we naturally expect that these measures of similarity will be
highly correlated. The high degree of similarity between the definitions of $S_{PQ}$, $J_{PQ}$, and
$C_{PQ}$ suggests that the usage of $S_{PQ}$ is reasonable as a measure of similarity.
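
The identities in this section are easy to check numerically. The short Python sketch
below is only an illustration: the function names and the two toy weight vectors are
hypothetical, and each list position stands for one country-sector pair $cs$.

```python
def jaccard(p, q):
    """Weighted Jaccard similarity: sum of minima over sum of maxima."""
    return sum(min(a, b) for a, b in zip(p, q)) / sum(max(a, b) for a, b in zip(p, q))


def alpha_numerator(p, q, alpha):
    """Numerator of the alpha-family of similarities."""
    return sum(a**alpha + b**alpha - abs(a - b)**alpha for a, b in zip(p, q))


def alpha_denominator(p, q, alpha):
    """Denominator of the alpha-family: same terms with the sign flipped."""
    return sum(a**alpha + b**alpha + abs(a - b)**alpha for a, b in zip(p, q))


def alpha_similarity(p, q, alpha):
    return alpha_numerator(p, q, alpha) / alpha_denominator(p, q, alpha)


# Made-up weights of two nodes P and Q toward four country-sector pairs.
p = [0.5, 0.3, 0.2, 0.0]
q = [0.4, 0.0, 0.2, 0.4]

# alpha = 1 reproduces the Jaccard similarity.
print(jaccard(p, q), alpha_similarity(p, q, 1))

# alpha = 2: the numerator equals twice the Cosine-similarity numerator.
print(alpha_numerator(p, q, 2), 2 * sum(a * b for a, b in zip(p, q)))
```

Both printed pairs should agree up to floating-point rounding, and the similarity equals 1
for identical vectors and 0 for vectors with no common support.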


5.3 Computational Algorithm 


The definition of similarity in Eq. [8] is not analytically tractable due to its nonlinearity, and
approximate methods are needed to determine the similarity between countries in specific
sectors. We use an iterative method to solve for $S_{PQ}$, defining the $(k+1)$th iteration of the
similarity as

$$S_{PQ;k+1} = \frac{\sum_{s} \sum_{c,c'} \left\{ \left[ p_{cs} p_{c's} + q_{cs} q_{c's} - (p_{cs} - q_{cs})(p_{c's} - q_{c's}) \right] S_{cs,c's;k} \right\}}
                   {\sum_{s} \sum_{c,c'} \left\{ \left[ p_{cs} p_{c's} + q_{cs} q_{c's} + (p_{cs} - q_{cs})(p_{c's} - q_{c's}) \right] S_{cs,c's;k} \right\}} \qquad (12)$$

In the results presented in this paper, we set $S_{PQ;0} = S_{PQ}$ as the initial value of the
similarity. This iteration is continued until $\max_{PQ}(|S_{PQ;k+1} - S_{PQ;k}|) < 0.001$, at which
point the algorithm is assumed to have converged. This relatively high convergence tolerance is
due to the computational complexity of the similarity: there are $\sim N_s \times N_c^2$ (each sector
and each pairing of countries for each year) similarities that must be computed, and each
requires on the order of $N_s \times N_c^2$ operations (the number of terms in the sums in Eq. [8]).
This leads to a computational time scaling as $N_s^2 N_c^4$ ($\sim 3 \times 10^9$ operations for
$N_s = 35$ and $N_c = 41$) to compute one iteration of the algorithm. Convergence to the threshold
occurred after ~30 minutes on a desktop computer (with the algorithm written in C++),
and was evaluated on 17 years of data.


The method converges exponentially fast as a function of the iteration (shown in Fig. A1),
and the similarities can be computed after a few hours on a single desktop. We also
compared the values of similarity generated using the initial condition $S_{PQ;0} = S_{PQ}$ with
those generated using the alternative initial condition defined in Eq. 7, and found that the
largest difference between the two measured similarities was on the order of 0.001, the
convergence threshold. This is consistent with the expectation that the algorithm converges
to a unique solution.
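
The iteration described in this section can be sketched in Python as follows. This is a toy
illustration under stated assumptions, not the C++ implementation used for the results: the
update follows the form of Eq. (12) as read here, the function name and data layout are
hypothetical, and the starting point is the identity (each country similar only to itself)
rather than the initial condition used in the paper. With that starting point the first
iterate reduces to the $\alpha = 2$ expression of Section 5.2.

```python
def iterate_similarity(weights, countries, sectors, tol=1e-3, max_iter=500):
    """Fixed-point iteration for the self-consistent similarity.

    weights[(a, t)] is a dict mapping (country, sector) -> link weight of node (a, t).
    Returns (sim, n_iter), where sim[(t, a, b)] is the similarity of countries
    a and b in sector t.
    """
    # Assumed starting point: identity similarity.
    sim = {(s, a, b): 1.0 if a == b else 0.0
           for s in sectors for a in countries for b in countries}
    for k in range(1, max_iter + 1):
        new = {}
        for t in sectors:
            for a in countries:
                for b in countries:
                    if a == b:
                        new[(t, a, b)] = 1.0
                        continue
                    p, q = weights[(a, t)], weights[(b, t)]
                    num = den = 0.0
                    for s in sectors:
                        for c in countries:
                            for c2 in countries:
                                pc, qc = p.get((c, s), 0.0), q.get((c, s), 0.0)
                                pc2, qc2 = p.get((c2, s), 0.0), q.get((c2, s), 0.0)
                                base = pc * pc2 + qc * qc2
                                cross = (pc - qc) * (pc2 - qc2)
                                num += (base - cross) * sim[(s, c, c2)]
                                den += (base + cross) * sim[(s, c, c2)]
                    new[(t, a, b)] = num / den if den > 0 else 0.0
        # Largest change between successive iterations (the stopping criterion).
        delta = max(abs(new[key] - sim[key]) for key in sim)
        sim = new
        if delta < tol:
            return sim, k
    return sim, max_iter


# Toy network with made-up weights: two mirror-image economies, one sector.
countries, sectors = ["A", "B"], ["x"]
weights = {
    ("A", "x"): {("A", "x"): 0.5, ("B", "x"): 0.1},
    ("B", "x"): {("A", "x"): 0.1, ("B", "x"): 0.5},
}
sim, n_iter = iterate_similarity(weights, countries, sectors)
print(n_iter, sim[("x", "A", "B")])
```

In this toy case the two countries play mirror-image roles, so the similarity climbs toward 1
as the iteration proceeds. The six nested loops make the $N_s^2 N_c^4$ cost per iteration
explicit.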



Figure A1: The convergence of the algorithm as a function of the iteration. Each line
denotes the maximum difference $\max_{PQ}(|S_{PQ;k+1} - S_{PQ;k}|)$ as a function of the iteration
for one of the 17 years (1995-2011), on log-linear axes.

5.4 Clustering Countries Based on Similarity 

Blockmodeling tools have been developed in the literature to partition network nodes into
clusters according to structural, automorphic and regular equivalence or other notions of
similarity. The network data are converted into a (dis)similarity matrix, to which a
clustering algorithm is then applied. In the following we show the clustering of countries
obtained when our measure of similarity is used to compute the distance matrix between
countries. We detect some interesting changes over time, such as the emergence of a German
cluster of upstream interdependencies and the reconfiguration of the relationships among the
European countries after the Fifth Enlargement of the European Union in the years 2004-2007.

Figure A2: Dendrogram of countries based on unweighted average distance clustering.
Distance has been computed as one minus average upstream similarity across all sectors in
year 1995. Coloring is used to highlight different clusters at a 1.38 cutoff for inter-group
dissimilarity. Countries are identified by their 3-character ISO codes.

Figure A3: Dendrogram of countries based on unweighted average distance clustering.
Distance has been computed as one minus average upstream similarity across all sectors
in year 2011. Coloring is used to highlight different clusters at a 1.365 cutoff for
inter-group dissimilarity. Countries are identified by their 3-character ISO codes.

Figure A4: Dendrogram of countries based on unweighted average distance clustering.
Distance has been computed as one minus average downstream similarity across all sectors
in year 1995. Coloring is used to highlight different clusters at a 1.38 cutoff for inter-group
dissimilarity. Countries are identified by their 3-character ISO codes.

Figure A5: Dendrogram of countries based on unweighted average distance clustering.
Distance has been computed as one minus average downstream similarity across all sectors
in year 2011. Coloring is used to highlight different clusters at a 1.365 cutoff for
inter-group dissimilarity. Countries are identified by their 3-character ISO codes.
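
The clustering step of Section 5.4 (distance taken as one minus average similarity,
followed by unweighted average linkage) can be sketched in plain Python. This is a minimal
illustration, not the procedure used to draw the dendrograms: the function names, the
country labels, the similarity values, and the cutoff are all made up for the example.

```python
def make_distance(labels, similarity, default=0.1):
    """Build a symmetric distance table as one minus similarity."""
    dist = {a: {} for a in labels}
    for a in labels:
        for b in labels:
            if a == b:
                dist[a][b] = 0.0
            else:
                s = similarity.get((a, b), similarity.get((b, a), default))
                dist[a][b] = 1.0 - s
    return dist


def average_linkage(dist, labels, cutoff):
    """Agglomerative clustering with unweighted average linkage (UPGMA).

    The two closest clusters are merged repeatedly until the smallest
    inter-cluster distance exceeds the cutoff.
    """
    clusters = [[x] for x in labels]

    def gap(c1, c2):
        # Average of all pairwise distances between members of c1 and c2.
        return sum(dist[a][b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min((gap(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical average similarities between five countries A..E.
labels = ["A", "B", "C", "D", "E"]
similarity = {("A", "B"): 0.8, ("A", "C"): 0.7, ("B", "C"): 0.75, ("D", "E"): 0.6}
dist = make_distance(labels, similarity)
clusters = average_linkage(dist, labels, cutoff=0.5)
print(clusters)
```

Recording the distance at which each merge happens, instead of stopping at a cutoff, would
give the information needed to draw a dendrogram.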